-
Notifications
You must be signed in to change notification settings - Fork 53
Conversation
When automatically switching a BasicDirectory to a HAMTDirectory because it grew too big, the current code: * loops every link in the BasicDirectory * reads each node referenced by those links * adds the nodes to a new HAMTDirectory shard, which in turn: * writes the nodes to the DAG service (they were just read from there!) * makes a link out of them (identical to the link in the BasicDirectory!) This would happen to about (~4000 nodes), which are fully read and written for nothing. This PR adds a new SetLink method to the HAMT Shard which, instead of taking an ipld.Node like Set(), takes directly an ipld.Link. Then it updates switchToSharding() to pass the links in the BasicDirectory directy, rather than reading all the nodes. Note that switchToBasic() works like this already, only using the links in the HAMT directory.
// FIXME: We don't need to set the name here, it will get overwritten. | ||
// This is confusing, confirm and remove this line. | ||
newLink.Name = ds.linkNamePrefix(0) + name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is copy-pasted from Set() above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks 🙏.
// given link. This avoids writing the given node, then reading it to making a | ||
// link out of it. | ||
func (ds *Shard) SetLink(ctx context.Context, name string, lnk *ipld.Link) error { | ||
hv := newHashBits(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Not required, but if in the future someone wants to extract this into reusable ds.SwapLink
function for similar types of optimizations they can do so. Although it's really just a few copied lines of code.
Thank you for merging so quickly. |
When automatically switching a BasicDirectory to a HAMTDirectory because it
grew too big, the current code:
This would happen to about (~4000 nodes), which are fully read and written for nothing.
This PR adds a new SetLink method to the HAMT Shard which, instead of taking
an ipld.Node like Set(), takes directly an ipld.Link. Then it updates
switchToSharding() to pass the links in the BasicDirectory directy, rather
than reading all the nodes.
Note that switchToBasic() works like this already, only using the links in the
HAMT directory.